Advanced Query Functions

The advanced query functions in Discover let the user apply analytic logic and calculations to a query (and its related visualization) to produce insights that go beyond slicing, dicing, and filtering data (see query interactions and query functions). These functions often employ statistical techniques, with or without classic machine learning algorithms. In some cases, the logic can be fully customized using Python or R.

Advanced Analytical Methods in Discover vs Model

The advanced query functions exposed in Discover are suited to applying logic quickly to query result sets and do NOT require a change to the underlying database and/or data model. Discover is not the right place to host and develop ML logic that typically runs for much longer, involves model training, and usually requires write-back to the database. Such logic belongs in a task-based, asynchronous data processing engine, which is what the Model app in the professional client is designed to handle.

Note: Most of the advanced analytical methods described below rely heavily on the 'context calculation' engine in Discover. Context or "Report" functions allow users to build calculations that can only be resolved efficiently in the context of a visualization or report.

Functions

  • Forecasting - The Forecasting tools are used to project future values in a time series, such as corporate sales over the next four quarters.
  • Regression - Regression functions allow users to compare two or more metrics across a range of items to see their relationship.
  • Statistics - The Statistics functions allow users to add and expose classic statistics to the current query.
  • Pareto Analysis - The Pareto function allows users to calculate Pareto values (the accumulation of a metric as a percentage of the total).
  • Clustering - The clustering functions statistically classify and group the current query's data points.
  • Outlier Analysis - The Outliers tool employs a context calculation to show which data points in the query fall statistically inside or outside bounds set around the data set mean.
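
To make two of the calculations above more concrete, here is a minimal sketch in plain Python of a Pareto accumulation and a mean-based outlier check. It is not Discover's implementation and does not use any Discover API. The function names `pareto` and `flag_outliers`, the `threshold` parameter, and the choice of "standard deviations from the mean" as the bound are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def pareto(values):
    """Return (value, cumulative % of total) pairs, largest value first.

    Hypothetical helper: illustrates 'metric accumulation as a
    percentage of the total', not Discover's own Pareto function.
    """
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total == 0:
        # Avoid division by zero when every value is zero.
        return [(v, 0.0) for v in ordered]
    running = 0.0
    result = []
    for v in ordered:
        running += v
        result.append((v, 100.0 * running / total))
    return result


def flag_outliers(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean.

    Assumption: bounds are mean +/- threshold * population std dev.
    The bounds Discover actually applies may be defined differently.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mu) > threshold * sigma for v in values]


if __name__ == "__main__":
    sales = [10, 30, 60]
    for value, cumulative in pareto(sales):
        print(value, round(cumulative, 1))

    readings = [10, 12, 11, 13, 12, 50]
    print(flag_outliers(readings))
```

In the sample data, the largest sales value alone accounts for 60% of the total, and only the reading of 50 lies more than two standard deviations from the mean. Both functions operate on an in-memory list of results, which mirrors the idea that these calculations run on the query result set and leave the underlying data model unchanged.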